Parsing strategies with 'lexicalized' grammars: application to Tree Adjoining Grammars
Authors
Yves Schabes, Anne Abeillé and Aravind K. Joshi
Abstract
In this paper, we present a parsing strategy that arose from the development of an Earley-type parsing algorithm for TAGs (Schabes and Joshi 1988) and from some recent linguistic work in TAGs (Abeillé 1988a). In our approach, each elementary structure is systematically associated with a lexical head. These structures specify extended domains of locality (as compared to a context-free grammar) over which constraints can be stated. These constraints either hold within the elementary structure itself or specify what other structures can be composed with a given elementary structure. The 'grammar' consists of a lexicon where each lexical item is associated with a finite number of structures for which that item is the head. There are no separate grammar rules. There are, of course, 'rules' which tell us how these structures are composed. A grammar of this form will be said to be 'lexicalized'. We show that in general context-free grammars cannot be 'lexicalized'. We then show how a 'lexicalized' grammar naturally follows from the extended domain of locality of TAGs and briefly examine some of the linguistic implications of our approach. A general parsing strategy for 'lexicalized' grammars is discussed. In the first stage, the parser selects a set of elementary structures associated with the lexical items in the input sentence, and in the second stage the sentence is parsed with respect to this set. The strategy is independent of the nature of the elementary structures in the underlying grammar. However, we focus our attention on TAGs. Since the set of trees selected at the end of the first stage is finite, the parser can in principle use any search strategy. In particular, a top-down strategy can be used, since problems due to recursive structures are eliminated. We then explain how the Earley-type parser for TAGs can be modified to take advantage of this approach.
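The first stage of the two-stage strategy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `ElementaryTree` record, the toy `LEXICON`, the bracketed tree names, and the `select_trees` function are all hypothetical. It only shows that selecting the structures headed by the words of the input yields a finite set, which a second-stage parser (not shown) would then work with.

```python
from collections import namedtuple

# Hypothetical record: an elementary structure reduced to a label and its lexical head.
ElementaryTree = namedtuple("ElementaryTree", ["name", "head"])

# Toy lexicon (illustrative only): each lexical item is associated with
# the finite list of elementary structures for which it is the head.
LEXICON = {
    "john": [ElementaryTree("NP(John)", "john")],
    "sleeps": [ElementaryTree("S(NP, VP(sleeps))", "sleeps")],
    "saw": [ElementaryTree("S(NP, VP(saw, NP))", "saw")],
    "often": [ElementaryTree("VP(often, VP*)", "often")],
}


def select_trees(sentence, lexicon=LEXICON):
    """Stage 1: collect the structures headed by the words of the input.

    Words missing from the lexicon contribute nothing; duplicates are
    dropped so the result is a finite list without repeats.
    """
    selected = []
    for word in sentence.lower().split():
        for tree in lexicon.get(word, []):
            if tree not in selected:
                selected.append(tree)
    return selected


if __name__ == "__main__":
    # Stage 2 (not implemented here) would parse the sentence using only these trees.
    for tree in select_trees("John often sleeps"):
        print(tree.head, "->", tree.name)
```

In this sketch the structure headed by "saw" is never handed to the second stage for the input "John often sleeps", which is the sense in which the search space is restricted before parsing begins.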
Comments: University of Pennsylvania Department of Computer and Information Science Technical Report No. MS-CIS-88-65. This technical report is available at ScholarlyCommons: http://repository.upenn.edu/cis_reports/691
PARSING STRATEGIES WITH 'LEXICALIZED' GRAMMARS: APPLICATIONS TO TREE ADJOINING GRAMMARS. Yves Schabes, Anne Abeillé and Aravind K. Joshi. MS-CIS-88-65, LINC LAB 126. Department of Computer and Information Science, School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA 19104
Similar resources
Which rules for the robust parsing of spoken utterances with Lexicalized Tree Adjoining Grammars?
In the context of spoken dialogue systems, we investigated a bottom-up robust parsing for LTAG (Lexicalized Tree Adjoining Grammars) that interleaves a syntactic and a semantic structure. When the regular syntactic composition rules fail, the syntactic islands and the corresponding partial semantic structures are combined thanks to additional local rules. We supply some descriptive limits of th...
Some Experiments on Indicators of Parsing Complexity for Lexicalized Grammars
In this paper, we identify syntactic lexical ambiguity and sentence complexity as factors that contribute to parsing complexity in fully lexicalized grammar formalisms such as Lexicalized Tree Adjoining Grammars. We also report on experiments that explore the effects of these factors on parsing complexity. We discuss how these constraints can be exploited in improving efficiency of parsers for ...
Lexicalization and Grammar Development
In this paper we present a fully lexicalized grammar formalism as a particularly attractive framework for the specification of natural language grammars. We discuss in detail Feature-based, Lexicalized Tree Adjoining Grammars (FB-LTAGs), a representative of the class of lexicalized grammars. We illustrate the advantages of lexicalized grammars in various contexts of natural language processing,...
A Python-based Interface for Wide Coverage Lexicalized Tree-adjoining Grammars
This paper describes the design and implementation of a Python-based interface for wide coverage Lexicalized Tree-adjoining Grammars. The grammars are part of the XTAGGrammar project at the University of Pennsylvania, which were hand-written and semi-automatically curated to parse real-world corpora. We provide an interface to the wide coverage English and Korean XTAG grammars. Each XTAG gramma...